Speaker tracking and detection with multiple speakers
نویسندگان
چکیده
We describe a speaker tracking and detection system, for Switchboard conversations, that uses a two-speaker and silence hidden Markov model (HMM) with a minimum state duration constraint and Gaussian mixture model (GMM) state distributions adapted from a single genderand handset-independent imposter model distribution. Speaker tracking is used to segment speakers for detection, which is carried out by averaging frame scores of the Viterbi path and HNORM’ing via a novel parameter interpolation extension of HNORM for use with files of arbitrary lengths. Use of duration statistics augmenting the acoustic scores is also introduced via a nonlinear combination function. Results are reported on the NIST 1998 Multispeaker development evaluation dataset.
منابع مشابه
Speaker Tracking in Seminars by Human Body Detection
This paper presents evaluation results of a method for tracking speakers in seminars from multiple cameras. First, 2D human tracking and detection is done for each view. Then, 2D locations are converted to 3D based on the calibration parameters. Finally, cues from multiple cameras are integrated in a incremental way to refine the trajectories. We have developed two multi-view integration method...
متن کاملA speaker tracking system based on speaker turn detection for NIST evaluation
A speaker tracking system (STS) is built by using successively a speaker change detector and a speaker verication system. The aim of the STS is to nd in a conversation between several persons (some of them having already enrolled and other being totally unknown) target speakers chosen in a set of enrolled users. In a rst step, speech is segmented into homogeneous segments containing only one sp...
متن کاملJointly Tracking and Separating Speech Sources Using Multiple Features and the generalized labeled multi-Bernoulli Framework
This paper proposes a novel joint multi-speaker tracking-andseparation method based on the generalized labeled multi-Bernoulli (GLMB) multi-target tracking filter, using sound mixtures recorded by microphones. Standard multi-speaker tracking algorithms usually only track speaker locations, and ambiguity occurs when speakers are spatially close. The proposed multi-feature GLMB tracking filter tr...
متن کاملOnline blind speech separation using multiple acoustic speaker tracking and time-frequency masking
Separating speech signals of multiple simultaneous talkers in a reverberant enclosure is known as the cocktail party problem. In real-time applications online solutions capable of separating the signals as they are observed are required in contrast to separating the signals offline after observation. Often a talker may move, which should also be considered by the separation system. This work pr...
متن کاملOnline multiple people tracking-by-detection in crowded scenes
Multiple people detection and tracking is a challenging task in real-world crowded scenes. In this paper, we have presented an online multiple people tracking-by-detection approach with a single camera. We have detected objects with deformable part models and a visual background extractor. In the tracking phase we have used a combination of support vector machine (SVM) person-specific classifie...
متن کامل